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SUMMARY 

The past thirteen years have seen the development of many algorithms for approximating matrix 
functions in 0{N) time, where N is the basis size. These 0{N) algorithms rely on assumptions about 
the spatial locality of the matrix function; therefore their validity depends very much on the argument 
of the matrix function. In this article I carefully examine the validity of certain 0{N) algorithms when 
applied to hamiltonians of disordered systems. I focus on the prototypical disordered system, the 
Anderson model. I find that 0{N) algorithms for the density matrix function can be used well below 
the Anderson transition (i.e. in the metallic phase;) they fail only when the coherence length becomes 
large. This paper also includes some experimental results about the Anderson model's behavior across 
a range of disorders. 
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1. INTRODUCTION 

Certain matrix functions - the Green's function, the density matrix, and the logarithm - are 
very important to science and engineering. Where they are important, scientists are faced 
with a computational bottleneck: evaluation of a matrix function generally requires 0{N^) 
time, where N is the basis size and usually scales linearly or worse with the system volume. 
This prohibits computations of large systems, even with modern computers. In particular, if 
a matrix function must be recalculated at every step of a system's evolution in time, then 
calculations with a basis size larger than 0(1000 — 10000) are not practical. 

In 1991 W. Yang introduced the Divide and Conquer algorithm, which approximated a 
matrix function in 0{N) time|2J|- This stimulated the development of many other 0{N) 
algorithms 12 which have met considerable success, permitting for instance quantum 
dynamics calculations of tens of thousands of atoms. All 0{N) algorithms rely on special 
characteristics of the system under study, and the question of their validity can be answered 
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only after having specified that system. To date, theoretical studies of the validity of these 
algorithms have occurred almost exclusively within the conceptual framework of ordered 
systems, using ideas of metals, insulators, and band gaps [51 [3 El ISl ^1 d ^1 ^1 IT^ . 

In this paper I carefully examine the applicability of 0{N) algorithms to disordered systems. 
I calculate a matrix function in two ways: with an 0{N) algorithm, and via diagonalization. 
Comparison of the two results provides some new insight into when disorder can make 0{N) 
calculations feasible. 

This paper begins by introducing 0{N) algorithms (section 2) and then explaining how to 
measure the errors which these algorithms induce (section 3.) Section 4 introduces disordered 
systems and the density matrix function, while section 5 gives theoretical estimates of 0{N) 
errors. I present my numerical results in section 6, and finish in section 7 with a short 
assessment of their reliability. 

2. 0(N) Algorithms 

All 0{N) algorithms make three basic assumptions: 

• Existence of a Preferred, Local Basis. It is assumed that the system is best described in 
terms of a localized basis set. I here define a basis as localized if for any basis element 
\tjj), only a small number of positions x satisfy {x\ip) ^ 0. 

• Existence of a Distance Metric. There must be a way of computing the physical distance 
between any two basis elements | V') and | V') . 

• Locality of both the Matrix Function and its Argument. I call a matrix A local if, for 
every pair of basis elements \x) and \y) that are far apart, (x\A\y) = 0. Throughout this 
article I choose a simple criterion for being far apart: comparison with a radius i?. 

There are a number of ways for an 0{N) algorithm to exploit the three basic assumptions. 
In this article I focus on the class of algorithms based on basis truncation. This class includes 
Yang's Divide and Conquer algorithm '21', the "Locally Self- Consistent Multiple Scattering" 
algorithm 13 0], and Goedecker's "Chebychev Fermi Operator Expansion" 22, 23 , which I 
will henceforth call the Goedecker algorithm. Basis truncation algorithms break the matrix 
function into spatially separated pieces. Given the position of a particular piece, the basis 
is truncated to include only elements close to that position, and then the matrix function is 
calculated within the truncated basis. Thus, for any generic matrix function f{H), a basis 
truncation algorithm calculates {x\f{H)\y) = {x\f{Ps^yHPg^jj)\y), where P^^y is a projection 
operator truncating all basis elements far from x, y. There may be also an additional step of 
interpolating results obtained with different P's, but I will ignore this. In this paper I choose 
P to be independent of the left index x, and truncate all basis elements outside a sphere of 
radius R centered at y. 

Basis truncation algorithms vary only in their choice of how to evaluate the function 
f{Px,yHPg^ij). Specifically, Yang's algorithm calculates / in terms of the argument's 
eigenvectors, the Goedecker algorithm does a Chebyshev expansion of /, and the "Locally 
Self-Consistent Multiple Scattering" algorithm calculates the argument's resolvent or Green's 
function and then obtains / via complex integration. Because these approaches are all 
mathematically equivalent when applied to analytic functions, they should all converge to 
identical results, as long as one makes identical choices of which matrix function to evaluate, 
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of how to break up the function, of which projection operator to use, and of a possible 
interpolation scheme. Moreover, given an identical choice of matrix function, variations in 
the other choices should obtain results that are qualitatively the same. In this paper I use 
the Goedecker algorithm, but I want to emphasize that the results obtained here apply to the 
whole class of basis truncation algorithms. 

The Goedecker algorithm is essentially a Chebyshev expansion of the matrix function. As 
long as all the eigenvalues of the argument H are between 1 and —1, a matrix function may be 
expanded in a series of Chebyshev polynomials of H: f{H) = X]f=o CsTs{H). The coefficients 
Cs are independent of the basis size, and therefore can be calculated numerically in the scalar 
case. The Chebyshev polynomials can be calculated in 0{N) time using the recursion relation 
Ts+i — {2HTs) — Ts-i, Ti = H, To = 1. (Of course, one must also bound the highest and 
lowest eigenvalues of H and then normalize. In practice very simple heuristics are sufficient for 
estimating these bounds.) If the matrix function f{H) has a characteristic scale of variation 
a, then the error induced by the Chebyshev expansion is controlled by an exponential with 
argument of order —aS. 



3. Measuring The Error 

Previous efforts to test numerically the accuracy of 0{N) algorithms have been confined to 
evaluations of whether the overall physical predictions are reasonable, and tests of convergence 
with respect to the localization radius R and any other parameters. In contrast, in this paper 
I compute a matrix function using both an 0{N) algorithm and an algorithm based on 
diagonalization, and then compare the results. 

This careful comparison required development of a metric for comparing two matrices. First, 
note that the dot product used for comparing two vectors can be easily generalized to matrices: 

MDP{A,B) = Tr{AB) = ^ {x\A\y) {y\B\x) 

\i A — this matrix dot product is just square of the Frobenius norm, one of the traditional 
norms for matrices. Note also that matrix dot product is invariant under change of basis. 
Moreover, it is simple to show that — 1 < mdp{A,b) < 1 so one may define the 

' ^ - y/MDP(A,A)MDP(B,B) " 

angle 9 between two matrices as the arcsin of this quantity. Bowler and Gillanl^ justified 
this, showing that the concept of perpendicular and parallel matrices is valid and useful. 

However the matrix dot product is not quite suited to my needs. 0{N) algorithms have a 
preferred, local basis, and thus are not well matched by a basis invariant measure. Moreover, 
the matrix functions which they compute are expected to agree best with the exact matrix 
functions close to the diagonal, and to agree not at all outside the truncation radius. Therefore, 
a more sensitive metric is needed, one that distinguishes different distances from the diagonal. 
I define the Partial Matrix Dot Product as: 

MDPiA, B,x) = Y, {y\A\y + ^ {y\B\y + x) 

y 

The argument x of this dot product allows me to obtain information about agreement 
at displacement x from the diagonal. It is still valid to call this a dot product, because 
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the magnitude of mdp{A,b,x) bounded by one, and thus one can compute 

^ ^MDP{A,A,x)MDP{B,B,x) * 

a displacement-dependent angle 9{x) and relative magnitude m{x). The partial matrix dot 
product has a simple sum rule relating it to the full matrix dot product: AIDP(A, B) = 
Y.sMDP{A,B,x). 

In my results I actually compute another dot product, an angular average MDP{A, B,r) 
over all x satisfying r = \x\. 



4. The Density Matrix 

In this paper I restrict my attention to a single matrix function, the density matrix. This 
function is very important in quantum calculations of electronic structure in atoms and 
molecules, where its argument is the system's Hamiltonian, its diagonal elements describe 
the charge density, and its off-diagonal elements are used to compute forces on the atoms. 
Eigenvalues of the Hamiltonian give the energies of their corresponding electronic states, and 
I will use the words eigenvalue and energy interchangeably throughout the rest of this paper. 

The density matrix function T, H) is basically a projection operator which deletes 
eigenvectors having energy E larger than the Fermi level /i. Here I use the following form: 

For physical reasons, it is not quite a projection operator: it has a transition region around fj, 
of width proportional to the temperature T where its eigenvalues interpolate between and 1. 
The error induced by a Chebyshev expansion is controlled by an exponential with argument 
proportional to —TS/A, where A is the size of the energy band and S is the number of terms 
in the Chebyshev expansion|22| . 

The density matrix is well suited to 0{N) algorithms. As ^ becomes large, it converges 
to the identity. Moreover, it is invariant under unitary transformations acting on the set of 
eigenvectors with energies below the Fermi level|2]- Even when the Hamiltonian's eigenvectors 
are not a local basis set, often a unitary transformation can be found which maps them to a 
basis which is localized. If such a transformation exists, the density matrix is localized. 

Several papers have examined density matrix locality in the context of ordered systems; 
i.e. ones whose Hamiltonians possess lattice translational invariance|Sl [71 |S1 |H1 El El 
El El' (Lattice translational invariance can be expressed quantitatively as {x\H\y) = 
(x -I- A|iJ|y-|- A) for all A located on an infinitely extended lattice.) It is well known that 
the eigenvalues of such systems are arranged in bands separated by energy gaps where there 
are no eigenvalues, and that the eigenvectors are extended through all space. Notwithstanding 
the nonlocality of the eigenvectors, there are strong arguments for localization in all ordered 
systems. If the system is metallic (meaning that the Fermi level lies in one of the bands of 
eigenvalues) and the temperature is zero, then in a three-dimensional system the density matrix 
is expected to fall off asymptotically as R~^, where R is the spatial distance from the diagonal. 
A non-zero temperature multiplies this by an exponential decay. If instead the system is an 
insulator, then the density matrix should decay exponentially even at T = 0. 

Most systems of physical interest do not exhibit lattice translational symmetry. In particular, 
many exhibit inhomogeneities at scales much smaller than that of the system itself. These are 
termed disordered systems. In this article I study the prototypical disordered system, the 
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Anderson modelJSj. It describes a basis laid out on a cubic lattice, one basis element per 
lattice site, and a very simple symmetric Hamiltonian matrix H composed of two parts: 

• A regular part: {x\H\y) = 1 if x and y are nearest neighbors on the lattice. This term 
is, up to a constant, just the second order discretization of the Laplacian; its spectrum 
consists of a single band of energies between —2D and 2D, where D is the spatial 
dimensionality of the lattice. 

• A disordered part: Diagonal elements < x\H\x > have random values chosen according 
to some probability distribution. In this article I choose to use the Gaussian distribution 



I call a the disorder strength; this is related to the disorder strength used in the literature 
by a factor of 712111121. 

At small disorder strengths, the Anderson model is dominated by its regular part; in 
particular the eigenvectors are extended throughout the whole system volume. However, there 
is a small but important departure from the ordered behavior: at the band edges one finds 
a few eigenvectors with volumes much smaller than the system volume. In fact, there is 
an energy Elqc such that any eigenvector with eigenvalue E satisfying \E\ > Eloc is 
localized. On average these eigenvectors decay exponentially with the spatial distance from 
their maximum |17|. As the disorder strength is increased, E^oc g^ts smaller and smaller; 
i.e. more and more of the energy band becomes localized. At a critical disorder strength the 
whole energy band becomes localized. This phenomenon is called the Anderson transition; 
for the Gaussian probability distribution used in this paper it occurs at the critical disorder 
(7c = 6.149±0.006 16 . 

Note that these statements must all be understood as regarding the ensemble of 
Hamiltonians determined by the probabilistic distribution of the disorder: for instance, I am 
stating that above the critical disorder the subset of Hamiltonians with unlocalized eigenvectors 
is vanishingly small compared to the total ensemble size. Moreover, these statements are valid 
for an infinite lattice; the mapping to computations on a finite lattice is not always absolutely 
clear. 

Studies of the locality of disordered systems have traditionally concentrated on computations 
of the Green's function, not the density matrix. It is expected that the average of the Green's 
function should decay exponentially as exp(^), where R is called the coherence lengthjS]. 
The density matrix, as we will see, is closely related to the Green's function, so one may hope 
that its average will also decay exponentially. However, there are two reasons why this hope 
may be unjustified: First, we do not need to know whether the average of all density matrices 
is localized, but instead whether each individual density matrix is localized. The difference 
between the two could be significant. Second, in a system below the critical disorder there 
will be unlocalized eigenvectors, and one might therefore expect the behavior typical of 
a metal. 

Many 0{N) computations have treated systems which are disordered [221 El ESI- However, 
the 0{N) literature contains little theoretical material about the applicability of 0{N) 
algorithms to disordered systems. The originators of the "Locally Self-Consistent Green's 
Function" algorithm, which does not truncate the basis but instead does a sort of averaging 
outside of a radius r, suggested that r should be related to the coherence length^, and 



P{V) 



1 



:exp( 




Copyright © 2000 John Wiley & Sons, Ltd. 
Prepared using niaauth.cis 



Numer. Linear Algebra Appl. 2000; 00:0-0 



0(N) ALGORITHMS FOR DISORDERED SYSTEMS 



5 



also to the error induced by their aver aging |5- Zhang and Drabold computed the density 
matrix of amorphous SiHcon using exact diagonalization and found an exponential decay [T^. 
In the next sections, I will first argue theoretically and then show numerically that 0{N) 
basis truncation algorithms are applicable to disordered systems, including ones far below the 
Anderson transition. 



5. Estimating the Error 
The quantity of interest is the relative error, 

MDP[f, /, r) 

where R is the radius of the truncation volume and A(i?) is the difference between the exact 
matrix function / and the approximate matrix function f{R). 

A first guess can be made from the intuition that the absolute error 

E{r, R) = MDP{A{R),A{R),r) (2) 

is probably bounded by its value at the boundary of the truncation region. This allows a rough 
estimate of the relative error: e(r, R) ^ MDP{f^f r) ' ^^^SS^sting that it can be made arbitrarily 
small if the matrix function is localized. The numerator, however, is left undefined. Perhaps it 
is reasonable to assume that on the boundary the absolute error is equal to the approximate 
matrix function, giving: 

eir,R)<^^^^M^ (3) 
' ' ' - MDP{fJ,r) 

The following paragraphs develop further insight into the absolute error by resolving A{R) 
into a multiple sum over dot products between the argument's eigenstates {tp) and position 
eigenstates \x) . Knowledge of the normalization and asymptotic behavior of these states shed 
some light on the magnitude of the matrix elements of the error: {x\A{R)\x + r) . 

Basis truncation algorithms separate the basis into two projection operators. Pa for the part 
inside the localization cutoff i?, and Pb for the part outside R. P4 and Pb are then used to 
divide the argument H into two parts: a part i?o = PaHPa + PbHPb which leaves A and 
B disconnected, and a boundary term connecting A and B, Hi = PaHPb + PbHPa- The 
final result of a basis truncation algorithm is just PAf{Ho)PA- Therefore, the error induced 
by a basis truncation algorithm is entirely due to the boundary term Hi. In other words, 
PaA{R)Pa = PaU{Ho + Hi) - fiHo))PA. 

For matrix functions which are analytic on a region of the complex plane which contains the 
poles of H and Hq, an exact equation for this boundary effect can be easily derived from the 
Dyson equation. First define the Green's functions G{E) = [E — H)~^ , Go{E) = [E — Hq)~^ . 
Then note that the matrix function can be obtained from the Green's function through contour 
integration over the complex energy E: f{H) = ^§ G{E)f{E), where the complex integral 
must contain the poles of H. Next, apply the Dyson equation G = Gq + GHiGq twice to 
obtain: 

G = Go ^" GqHiGq -\- GqHiGHiGq 
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This gives an exact relation between the correct Green's function G of the untruncated 
argument H — Hq + Hi and the Green's function Go of the truncated argument Hq. In 
order to obtain a similar relation for the matrix function /, one must make the poles in this 
expression explicit and then do a complex integration. Define \a) and |6) as two eigenvectors 
of Hq which are both located inside of the localization region, the set of |c) as the complete 
set of eigenvectors of H, and Ea^ Eb, and Ec as their respective energies. Then: 



OO />OC /"OO 



{x\A{R)\x + r) ^ / / dEadEi,dEc{x\a){a\Hi\c){c\Hi\b){b\x + r)g{Ea,Eb,Ec) (A) 

•J — OO J — OO J — OO 

where 

g{Ea,Eb,Ec)^nA{Ea)nA{Eb)n{Ec) j>dEf{E)^^^^^^^^^ 

n{E) is the spectral density 5{E — Ec) and is often approximated as a continuous 
function. Similarly, nA{E) is the spectral density of the eigenstates of Hq which are located 
inside the truncation region. If the matrix elements and matrix function are well-behaved, then 
this integral is also well-behaved. Consider the integral 7 = / // dEadEbdEcg{Ea, Eb, Ec). 
When / is the density matrix and one uses a simple model with n equal to a constant ^ inside 

the energy band this integral is of order ^(^R^)'^ when ^ is inside the energy band 

and when it is outside the band. 

Assuming that all eigenvectors of both H and Ho arc unlocalized, one can use Eq. 0] to 
derive an upper bound on |(x|A(i?)|a; + r)\ oi order i?^. In the case of the density matrix this 
is a gross overestimate. Nonetheless Eq. 0]can teach three lessons: 

5.1. Localized systems 

Suppose that all the eigenvectors are bounded by exp (— )^ where Xq is the origin of 
the eigenvector and L is the minimum decay length of the system. Then one can use Eq. 0| 
to prove that |(a;|A(i?)|a; -I- f)| is bounded by a polynomial times exp(— ^^^) for large R, 
that the absolute error E{r,R) is bounded by a polynomial times exp (— ^^^^^^), and that 
the relative error can be made arbitrarily small. This suggests that in localized systems the 
absolute error E{r,R) depends exponentially on r. If so, then Eq. |31is a gross overestimate. 

5.2. Unlocalized systems 

If the eigenvectors are unlocalized, then the magnitudes of {x\a) and {b\x + r) have no strong 
dependence on x and r. This suggests that |(x|A(i?)|x -I- r)| is roughly independent of the 
position, and that E{r, R) is roughly independent of r, thus providing partial justification of 
Eq.El 

5.3. Finite coherence length 

Eq. 0] suggests that in systems with a finite coherence length r] the absolute error is reduced, 
via reduction of the matrix elements {a\Hi\c) and {c\Hi\h). I assume a very crude model of 
the incoherence where the eigenvectors are broken into domains with constant phase, each 
domain of size . The main effect is to decrease any integral over an eigenvector by a 
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Radius Disorder Strength 

Figure 1. The left graph shows the normahzed density matrix magnitude at fj, — 0. Each line 
corresponds to a different disorder strength between 0.65 and 9.00; lower disorders are higher on 
the graph. The right graph shows the ratio of the square root of the second moment to the mean at 
jj. — 0. The different lines correspond to different radii r = n\/3, with smaller radii lower on the graph. 

factor of y^Njf, where iV^ is the number of different domains where the integrand is non-zero. 

Hi touches about such domains. Therefore ii R > rj, {a\Hi\c) oc ^ and E{r,R) oc 

Analytic calculations of the second moment of the matrix element confirm the same scaling 
law. 



6. Results 

I studied ensembles of Anderson hamiltonians at eleven disorder values between a = 0.65 and 
cr — 9.00, including the critical disorder CTc — 6.15. In the results presented here a truncation 
radius of i? = 5 was used, but the results are qualitatively similar to those obtained with 
R = 1,2,3,4. A lattice size of 16'^ was used, and calculations with 12'^ and 20^ lattices at 
the critical disorder ac indicate that finite size effects are small. The largest such effect is an 
improvement of the basis truncation algorithm's accuracy at smaller lattice volumes. At each 
disorder I calculated the density matrix at 13 values of the Fermi level /it ranging uniformly 
from —12 to 12, which covered the whole energy band at the lower disorders, and most of it 
even at larger disorders. Close to the edges of the energy band the density matrix magnitude 
drops precipitously and the other observables also change rapidly; the main results reported 
here apply only to values of ^ where the spectral density remains high. 

A low temperature (T = 0.05) was chosen in order to minimize any temperature effect. A 
careful examination of the density matrix's behavior at ^ = showed that any temperature 
effect was swamped by other effects. In particular, at low disorder the density matrix's behavior 
is dominated by lattice effects. Because of the low temperature a large number of Chebyshev 
terms was needed; for each matrix I used a number of terms S equal to 25 times the total 
band width, which was enough to make the truncation error quite small. 

The left graph of figure ^ shows the normalized density matrix magnitude MBPip ^ ff) 
/i = 0. For r > and a > 1.65 a good fit to this quantity can be obtained by r^'^exp (— 1^), 
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Figure 2. Important volume scales. Each volume measure was averaged across the ensemble and across 
a small interval of the energy spectrum. Shown here are the maximum values of these averages. 



where the coherence length is given by ^ = (0.057CT - 0.089 - OMAa-^) . Note the almost 
inverse relation between the coherence length R and the disorder strength a. Lattice effects 
cause a systematic uncertainty in the first constant 0.057 of roughly 30%. Similar fits can be 
obtained at other Fermi levels within the band < 6; the first constant has a minimum at 
/i = and a total variation of about 30%. 

Now I consider the statistical distribution of the density matrix magnitude. The right graph 
of figure ^ shows the ratio of the square root of the second moment to the mean. Note that 
this ratio seems to grow roughly exponentially with the disorder a. An examination of the 
kurtosis (the normalized fourth moment of the statistical distribution) of the density matrix 
magnitude shows that for a > 5.15 this quantity becomes very large, starting at larger radii and 
larger Fermi level These statistics suggest that at the Anderson transition the statistical 
distribution of the density matrix magnitude develops a long tail; it loses its self-averaging 
property. 

Figure El shows two measures of eigenvector volume. The first is the inverse of the first 

participation ratio; i.e. |(-0|a;)|^) KV'I^?)!^- This quantity is a lattice friendly measure 
of volume because it has a minimum value of one when {4'\x) is a delta function and a maximum 
value of the system size when {ip\x) is a constant. Figure [3 shows that this volume becomes 
smaller than the truncation volume in the range cr = 3.00 to a = 4.00. 

-1 

My second volume measure is based on the second moment: ((27r) detQ) ^ , where Q is 
the second moment (or quadrupole tensor) of Unlike the first volume measure, this 

measure remains large even at large disorders, indicating that each eigenfunction consists of 
several isolated peaks scattered throughout the system volume. This structure is caused by the 
fact that states with similar energies will mix even if they are connected by exponentially small 
matrix elements. However, mixing caused by such small matrix elements does not influence 
the density matrix, because it essentially just induces a unitary transformation of the mixed 
eigenvectors, and as we know the density matrix is invariant under unitary transformations 
between states that are all either less than fi or more than fi. 

Figure 121 also shows the maximum value of the coherence volume; Vc — maxEV{E), where 
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V{E) is the coherence volume. I calculated this volume by first computing the correlation 

function C{x) = J dk |(A;|'0)| exp{ik ■ x), and then applying my two volume measures to 
C^(af). Taking the maximum value resolves an important ambiguity: at disorder strengths 
(T < 6.15 the coherence length shows two peaks at energies \E\ = 6 — 8. These peaks become 
very pronounced at cr < 1.65, where they are a factor of 10 above the minimum. Note that the 
coherence volume becomes small much sooner than the eigenvector volume, in the range from 
cr — 1.65 to cr = 2.65. For cr > 1.15 and Vc S> 1 it is roughly proportional to a^^. This fits 
well with the density matrix's coherence length at large cr, but not at small cr. 

Figure IHl shows the relative error and absolute error as a function of disorder. The 0{N) 
algorithm begins to work well at r = 0, 1 at quite small disorders, and the r = 2 relative error 
falls to 1% at about a » 3. Clearly the 0{N) algorithm's success is controlled by the coherence 
volume, not by the eigenvector volumes. 

The overlapping lines in the right hand graph of figure |31 confirm section l5.2l s argument that 
the absolute error E{r, R) is roughly independent of r, except at r = 0. The magnification at 
r = is probably caused by the density matrix's close relationship to the identity matrix. It is 
more than compensated for by a corresponding magnification of the density matrix at r = 0, 
so that the left hand graph shows that the value of the relative error e(r, i?) at r = is smaller 
than its value at r = 1. 

Section [^31 suggested that a small coherence length may cause a decrease in the absolute 
error E{r,R) of order The R — 5 line in the left hand graph of figure 01 shows that the 
decrease is actually even more pronounced: at the boundary r = R, E{r, R) is of the same 
order of magnitude as MDP{f,f,r), which is controlled by an exponential. This is just the 
ansatz used to obtain Eq.|3| my results fully support the validity of Eq.|3] except - as discussed 
before - at r = 0, and at the edges of the energy band. Therefore accurate estimates of the 
error of an 0(N) calculation may be obtained from the results of the 0{N) calculation itself. 

Section I^Tl showed that for large R the absolute error E{r, R) is bounded by a polynomial 
times exp {—^^^j^), where L is the decay length. This suggests an r dependence which is not 
confirmed by the right hand graph of figure [31 where it would cause a splitting of the lines 
at cr > 6.15. However, the previous paragraph showed that E{r,R) actually depends on R 
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as R~'^exp{^~), where R is the coherence length. Therefore as long as i? < 2L, the bound 
obtained in section f5.ll is automatically verified at all r. Moreover at small R the unknown 
polynomial in the bound may mask any r dependence. 

7. Reliability of These Results 

I have already discussed the errors due to the finite lattice size and to the truncation 
of the Cheybyshev expansion. While a precise analytical and numerical treatment of both 
error sources is possible, my checks indicate that they will have at most a quantitative, not 
qualitative, effect on the results of this paper. The main risks to the results probably lie in 
two areas: finite ensemble size, and software reliability and reproducibility. I have taken steps 
to manage both issues: 

7.1. Finite ensemble size. 

All results presented here were obtained from ensembles of 33 realizations, but repeating the 
calculations with ensembles of 100 realizations yielded the same results. At the critical disorder 
the same quantities were computed with three different lattice sizes (100 realizations at both 
12^ and 16'^, and 10 realizations at 20^), and the agreement is very good. Graphing any 
quantity across several disorders, one immediately notices that there is little noise induced 
overlap of the two graphs. Therefore, it seems likely that risks due to finite ensemble size are 
under control. 

7.2. Software Reliability and Reproducibility. 

I have tried very hard to reduce this risk. The software includes an automated test suite which 
tests all computational functions except the highest level output (graph printing) routines. 
Moreover, I have taken pains to enable other researchers to easily reproduce and check my 
results, simply by installing my software and the libraries it depends on, compiling it with the 
GNU gcc compilerfT^, and starting it running. The software, with all needed configuration 
files, is available under the GNU Public License 'SF; check www.sacksteder.com for further 
details. 
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